Abstract. Configuring consists in simulating the realization of a complex
product from a catalog of component parts, using known relations between
types, and picking values for object attributes. This highly combinatorial
problem in the field of constraint programming has been addressed with a
variety of approaches since the foundational system R1 [5]. An inherent
difficulty in solving configuration problems is the existence of many
isomorphisms among interpretations. We describe a formalism-independent
approach to improve the detection of isomorphisms by configurators, which does
not require adapting the problem model. To achieve this, we exploit the
properties of a characteristic subset of configuration problems, called the
structural sub-problem, whose canonical solutions can be produced or tested at
a limited cost. In this paper we present an algorithm for testing the
canonicity of configurations, which can be added as a symmetry breaking
constraint to any configurator. The cost and efficiency of this canonicity
test are given.



1 Introduction 

Configuring consists in simulating the realization of a complex product from a
catalog of component parts (e.g. processors, hard disks in a PC), using known
relations between types (motherboards can connect up to four processors), and
instantiating object attributes (selecting the RAM size, bus speed, ...).
Constraints apply to configuration problems to define which products are valid,
or well formed. For example in a PC, the processors on a motherboard all have
the same type, the RAM units have the same wait times, and the total power of a
power supply must exceed the total power demand of all the devices.
Configuration applications deal with such constraints, which bind variables
occurring in the form of variable object attributes deep within the object
structure.

The industrial need for configuration applications is widespread, and has 
triggered the development of many configuration applications, as well as generic 
configuration tools or configurators, built upon all available technologies. For 
instance, configuration is a leading application field for rule based expert systems. 



As an evolution of R1 [5], the XCON system [3] designed in 1989 for computer
configuration at Digital Equipment involved 31000 components and 17000 rules.
Configuration has been applied, or is planned, in many different industrial
fields: electronic commerce (the CAWICOMS project [?]), software [19],
computers [?], electric engine power supplies [?] and many others like
vehicles, electronic devices, customer relation management (CRM), etc.

The high variability rate of configuration knowledge (parts catalogs may
vary by up to a third each year) makes configuration application maintenance
a challenging task. Rule based systems like R1 or XCON lack modularity in
that respect, which encouraged researchers to use variants of the CSP
formalism (like DCSP [?], structural CSP [?], composite CSP [?]), constraint
logic programming (CLP [?], CC [?], stable models [?]), or object oriented
approaches [8,12].

One difficulty with configuration problems stems from the existence of many
isomorphisms among interpretations. Isomorphisms naturally arise from the fact
that many constraints are universally quantified (e.g. "for all motherboards,
it holds that their connected processors have the exact same type"). This
issue is technically discussed in several papers [8,11,18,17]. The most
straightforward approach is to treat all yet unused objects as interchangeable
during the search. This is a widely known technique in constraint programming,
applied to configuration in [8,17] for instance. However, this does not account
for the isomorphisms arising during the search because substructures are
themselves isomorphic (e.g. two exactly identical PCs with the same
motherboards and processors are interchangeable).

The work in [8], implemented within the ILOG commercial configurators (see
footnote 1), suggests replacing some relations between objects with cardinality
variables counting the number of connected elements for each type. This
technique is very efficient and intuitively addresses many situations. For
instance, to model a purse, it suffices to count how many coins of each type it
contains, and it would be wasted effort to model each coin as an isolated
object. This solution has two drawbacks: it requires a change in the model, and
the counted objects cannot themselves be configured. Hence the isomorphisms
arising from the existence of isomorphic substructures cannot be handled this
way.

[18] applies a notion called "context dependent interchangeability" to
configuration. This is more general than the two approaches seen before, but
applies to the specific area of case adaptation. Also, since detecting context
dependent interchangeability is non polynomial, [18] only involves an
approximation of the general concept. Furthermore, the underlying formalism,
standard CSPs, is known to be too restrictive for configuration in general.

One step towards dealing with the isomorphisms emerging from structural
equivalence in configurations is to isolate this "structure" and study its
isomorphisms. This is the main goal pursued here: we propose a general approach
for the elimination of structural isomorphisms in configuration problems. This
generalizes already known methods (the interchangeability of "unused" objects,
as well as the use of cardinality counters) while not requiring any adaptation
of the configuration model. After describing what we call a configuration's
structural sub-problem, we define an algorithm to test the canonicity of its
interpretations. This algorithm can be adapted to complement virtually any
general purpose configuration tool, so as to prevent exploring many redundant
search sub-spaces. This work greatly extends the possibilities of dealing with
configuration isomorphisms, since it does not require a specific formalism. We
study the complexity of the canonicity test, and compare the complexity of the
original problem with that of the version exploiting canonicity testing.

1 http://www.ilog.fr

The paper is structured as follows: section 2 describes configuration problems,
their structural sub-problems, and the formalism used throughout the paper.
Section 3 defines T-trees, the models of structural sub-problems, together with
their isomorphisms and canonical representatives. Section 4 shows that
restricting the enumeration to canonical T-trees preserves completeness.
Section 5 presents an algorithm to test the canonicity of T-trees, with its
complexity. Section 6 lists combinatorial results. Finally, section 7 concludes
and opens various perspectives.

2 Configuration problems, and structural sub-problems 

A configuration problem describes a generic product, in the form of declarative 
statements (rules or axioms) about product well-formedness. Valid configuration 
model instances are called configurations, are generally numerous, and involve 
objects and their relationships. There exist several kinds of relations : 

— types: unary relations involved in taxonomies, with inheritance. They are
central to configuration problems since part of the objective is to determine,
or refine, the actual type of all objects present in the result (e.g. the
program starts with something known as a "Processor", and the user expects to
obtain something like "Proc_Brand Speed").

— other unary relations corresponding to Boolean object properties (e.g. a
main board has a built-in SCSI interface)

— binary composition relations (e.g. car wheels, the processor in a main board,
...). An object cannot act as a component for more than one composite.

— other relations: not necessarily binary, allowing for loose connections (e.g.
in a computer network, the relation between computers and printers)

Configuration problems generally exhibit solutions having a prominent structural 
component, due to the presence of many composition relations. Many isomor- 
phisms exist among the structural part of the solutions. We isolate configuration 
sub-problems called structural problems, that are built from the composition re- 
lations, the related types and the structural constraints alone. By structural 
constraints, we precisely refer to the basic constraints that define the structure : 

— those declaring the types of the objects connected by each relation 

— the constraints that specify the maximal cardinalities of the relations (the 
maximal number of connectable components) 
To ensure the completeness of several results at the end of the paper, we impose
two restrictions on the kind of constraints that define structural problems:
minimal cardinality constraints are not accounted for at that level (they remain
in the global configuration model), and the target relation types are all
mutually exclusive (see footnote 2).

For simplicity, we abstract from any configuration formalism, and consider
a totally ordered set $O$ of objects (we normally use $O = \{1, 2, \ldots\}$), a
totally ordered set $T_C$ of type symbols (unary relations) and a totally
ordered set $R_C$ of composition relation symbols (binary relations). We note
$\prec_O$, $\prec_{T_C}$ and $\prec_{R_C}$ the corresponding total orders.

Definition 1 (syntax). A structural problem, as illustrated in figure 1, is a
tuple $(t, T_C, R_C, C)$, where $t \in T_C$ is the root configuration type, and
$C$ is a set of structural constraints applied to the elements of $T_C$ and
$R_C$.



$t$ = PC

$T_C$ = {PC, Monitor, Supply, Mainboard, Processor, HDisk}
$R_C$ = {PC-Monitor, PC-Supply, PC-Mainboard, Mainboard-Processor,
         Mainboard-HDisk}

$C$ = { $\forall x, y\ \text{PC-Monitor}(x, y) \rightarrow \text{PC}(x) \wedge \text{Monitor}(y)$, ...
        $\forall x\ |\{y \text{ s.t. } \text{PC-Monitor}(x, y)\}| \le 2$, ...
        $\forall x\ \text{PC}(x) \rightarrow \neg\text{Monitor}(x)$, ... }



Fig. 1. Structural problem example 

Definition 2 (semantics). An instance of a structural problem
$(t, T_C, R_C, C)$ is an interpretation $I$ of $t$ and of the elements of $T_C$
and $R_C$, over the set $O$ of objects. If an interpretation satisfies the
constraints in $C$, it is a solution (or model) of the structural problem.

In the spirit of usual finite model semantics, $T_C$ members are interpreted by
elements of $\mathcal{P}(O)$, and $R_C$ members by elements of
$\mathcal{P}(O \times O)$ (relations). For instance, an interpretation of the
type "Processor" can be $\{4, 6\}$, which means that 4 and 6 alone are
processors. Similarly, an interpretation of the binary relation
"Mainboard-Processor" can be $\{(1, 4), (2, 6)\}$.

For readability reasons and unless ambiguous, in the rest of the paper we use
the term configuration to denote a model of a structural problem. Figure 2 lists
a sample model of the structural problem detailed in figure 1. It is obvious
from this example that object types can be inferred from the composition
relations. We define the following:

Definition 3 (root, composite, component). A configuration, solution of a
structural problem $(t, T_C, R_C, C)$, can be described by the set $U$ of
interpretations of all the elements of $R_C$. If $R_U$ denotes the union of the
relations in $U$ ($R_U = \bigcup_{rel \in U} rel$) and $R_U^+$ denotes its
transitive closure, then we have:

1. $\exists!\ root \in O$, called root of the configuration (see footnote 3),
for which $\forall o \in O,\ (o, root) \notin R_U$,

2. $\forall o \in O$ s.t. $o \neq root$, $\exists!\ c \in O$ s.t.
$(c, o) \in R_U$; we call $c$ the composite of $o$ and $o$ a component of $c$,

3. $\forall o \in O$ s.t. $o \neq root$, $(root, o) \in R_U^+$.

2 This can be compensated for by using zero max cardinality constraints in the
global configuration problem.

Figure 2 lists a configuration of the problem described in figure 1.



I(PC-Monitor) = {(1,2)},
I(PC-Supply) = {(1,3)},
I(PC-Mainboard) = {(1,4)},
I(Mainboard-Processor) = {(4,5), (4,6)},
I(Mainboard-HDisk) = {(4,7), (4,8)},
I(PC) = {1}, ..., I(HDisk) = {7,8}

Fig. 2. A solution of the structural problem of figure 1.

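As an illustration, the three conditions of Definition 3 can be checked mechanically on the configuration of figure 2. The sketch below is only indicative: the dictionary `U` and the function `check_tree` are hypothetical names, and the set of objects is restricted to those occurring in some relation, which is an assumption the definition does not state.

```python
from itertools import chain

# Interpretation of the composition relations, copied from figure 2.
U = {
    "PC-Monitor": {(1, 2)},
    "PC-Supply": {(1, 3)},
    "PC-Mainboard": {(1, 4)},
    "Mainboard-Processor": {(4, 5), (4, 6)},
    "Mainboard-HDisk": {(4, 7), (4, 8)},
}

def check_tree(U):
    """Return the root if U meets the three conditions of Definition 3, else None."""
    r_u = set(chain.from_iterable(U.values()))        # union of all relations
    objects = {o for pair in r_u for o in pair}        # assumption: used objects only
    composites = {}
    for c, o in r_u:
        composites.setdefault(o, set()).add(c)
    # Condition 1: exactly one object has no composite.
    roots = [o for o in objects if o not in composites]
    if len(roots) != 1:
        return None
    root = roots[0]
    # Condition 2: every other object has exactly one composite.
    if any(len(composites[o]) != 1 for o in objects if o != root):
        return None
    # Condition 3: every other object is reachable from the root.
    children = {}
    for c, o in r_u:
        children.setdefault(c, []).append(o)
    seen, stack = {root}, [root]
    while stack:
        for o in children.get(stack.pop(), []):
            if o not in seen:
                seen.add(o)
                stack.append(o)
    return root if seen == objects else None

print(check_tree(U))  # root object of the sample configuration
```

On the sample configuration the function returns object 1, the PC. An interpretation where some object has two composites, or where a cycle is disconnected from the root, is rejected.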
3 Isomorphisms 

From a practical standpoint, as soon as two objects of the same type appearing in 
a configuration are interchangeable, it is pointless to produce all the isomorphic 
solutions obtained by exchanging them. Two solutions that differ only by the 
permutation of interchangeable objects are redundant, and the second has no 
interest for the user. It would be particularly useful for a configurator to generate 
only one representative of each equivalence class. More interestingly, the
ability to skip redundant interpretations also prunes many sub-spaces from the
search space, and was shown to be a key issue in other areas of finite model
search.

Definition 4. We note $U(rel)$ the relation interpreting the relational symbol
$rel \in R_C$ in $U$. Two configurations $U$ and $U'$ are isomorphic if and only
if there exists a permutation $\theta$ over the set $O$, such that
$\forall r \in R_C,\ \theta(U)(r) = U'(r)$.
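To make Definition 4 concrete, the following sketch applies a permutation of objects to part of the configuration of figure 2. The names `apply_permutation`, `swap_processors` and `swap_mixed` are hypothetical, and a permutation is represented as a dictionary that is the identity on the objects it does not mention.

```python
def apply_permutation(theta, U):
    """Image of a configuration U under a permutation theta of the objects."""
    return {
        rel: {(theta.get(a, a), theta.get(b, b)) for a, b in pairs}
        for rel, pairs in U.items()
    }

# Part of the configuration of figure 2.
U = {
    "Mainboard-Processor": {(4, 5), (4, 6)},
    "Mainboard-HDisk": {(4, 7), (4, 8)},
}

swap_processors = {5: 6, 6: 5}   # exchanges two processors
swap_mixed = {5: 7, 7: 5}        # exchanges a processor and a disk

same = apply_permutation(swap_processors, U)
other = apply_permutation(swap_mixed, U)
print(same == U, other == U)
```

Exchanging the two processors maps the configuration onto itself, while exchanging a processor with a disk yields a different interpretation that is nevertheless isomorphic to the first one by definition. Both situations are redundant from the user's point of view.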

3.1 Coding configurations, T-trees 

Because composition relations bind component objects to at most one composite
object, configurations can naturally be represented by trees. For practical
reasons, we make the hypothesis that two distinct relations cannot share both
their component and composite types (see footnote 4). Then any configuration
$U$ is in one to one correspondence with an ordered tree where:

1. nodes are labeled by objects of $O$,

2. edges are labeled by the component side type of the corresponding relation,

3. child nodes are sorted first by their type according to $\prec_{T_C}$, then
by their label according to $\prec_O$.

3 Root unicity does not restrict generality, since this can be achieved if
needed by introducing an extra type and an extra relation.

4 Without loss of generality: a composition relation can be replaced by two
composition relations plus a new extra type.




Fig. 3. Two isomorphic configuration trees. 

Figure 3 illustrates this translation by an artificial example, which shows that
object numbers are redundant. If we suppress them, we keep the possibility
to produce a configuration tree isomorphic to the original via a breadth first
traversal. We hence introduce T-trees, which capture part of the isomorphisms
that exist among configurations:

Definition 5 (T-tree). A T-tree is a finite and non empty ordered tree where
nodes are labeled by types and children are ordered according to $\prec_{T_C}$.
We note $(T, (c_1, \ldots, c_k))$ the T-tree with sub-trees $c_1, \ldots, c_k$
and root label $T$.

To translate a configuration tree into a T-tree, we simply replace the node
labels by their parent edge labels. Several T-tree examples are listed in figure
4. To perform the opposite operation, i.e. build a configuration tree from a
T-tree, it suffices to generate node labels via a breadth first traversal (using
consecutive integers, the root being labeled 0), then to relabel the edges.
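The two translations can be sketched as follows. This is a minimal illustration under stated assumptions: a T-tree is encoded as a nested pair `(type, children)`, the configuration is given as a hypothetical mapping `comp` from each composite to its `(component type, object)` pairs, and the alphabetical order of type names stands in for the order on types.

```python
from collections import deque

def to_ttree(obj, obj_type, comp):
    """Drop object labels: children are sorted by type, then by object label."""
    kids = sorted(comp.get(obj, []))
    return (obj_type, tuple(to_ttree(o, t, comp) for t, o in kids))

def to_config(ttree):
    """Rebuild (composite, component type, component) triples, labelling nodes
    with consecutive integers in breadth first order, the root being 0."""
    triples, next_label = [], 1
    queue = deque([(0, ttree)])
    while queue:
        label, (_, children) = queue.popleft()
        for child in children:
            triples.append((label, child[0], next_label))
            queue.append((next_label, child))
            next_label += 1
    return triples

# The configuration of figure 2, grouped by composite object.
comp = {
    1: [("Monitor", 2), ("Supply", 3), ("Mainboard", 4)],
    4: [("Processor", 5), ("Processor", 6), ("HDisk", 7), ("HDisk", 8)],
}
ttree = to_ttree(1, "PC", comp)
triples = to_config(ttree)

# Translating the rebuilt configuration again gives the same T-tree.
comp2 = {}
for composite, comp_type, component in triples:
    comp2.setdefault(composite, []).append((comp_type, component))
print(to_ttree(0, "PC", comp2) == ttree)
```

The rebuilt configuration uses different object numbers than figure 2, yet it maps to the same T-tree, which is the content of the round trip described above.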

Proposition 1. Let $A_1$ be a configuration tree, $C_1$ the corresponding
T-tree, and $A_2$ the configuration tree rebuilt from $C_1$. Then $A_1$ and
$A_2$ are isomorphic.

The proof is straightforward. A permutation $\theta : O \to O$ which asserts
the isomorphism can be built by simply superposing $A_1$ and $A_2$. Since every
configuration bijectively maps to a configuration tree, this result legitimates
the use of T-trees to represent configurations. This encoding captures many
isomorphisms, because the references to members of the set $O$ are removed, and
the children ordering respects $\prec_{T_C}$.
3.2 A total order over T-trees



Configuration trees and T-trees being trees, they are isomorphic, equal, super- 
posable, under the same assumptions as standard trees. 

Definition 6 (Isomorphic T-trees). Let $C = (T, (a_1, \ldots, a_k))$ and
$C' = (T', (b_1, \ldots, b_l))$ be two T-trees.

Isomorphism: $C$ and $C'$ are isomorphic ($C \cong C'$) if $T = T'$, $k = l$ and
there exists a bijection $\sigma : \{a_1, \ldots, a_k\} \to \{b_1, \ldots, b_k\}$
such that $\forall i,\ \sigma(a_i) \cong a_i$. $Iso(C)$ denotes the set of
trees which are isomorphic to a T-tree $C$.

Equality: $C$ and $C'$ are equal ($C = C'$) if $k = l$, $T = T'$, and
$\forall i,\ a_i = b_i$.

Proposition 2. Two configurations are isomorphic iff their corresponding T- 
trees are. 

As a means of isolating a canonical representative of each equivalence class of
T-trees, we define a total order over T-trees. We note $nct(T)$ (number of
component types) the number of types $T_i$ having $T$ as composite type for a
relation in $R_C$. The types $T_i$ ($1 \le i \le nct(T)$) are numbered on each
node according to $\prec_{T_C}$. If $C$ is a T-tree, we call T-list and we note
$T_i(C)$ the list of its children having $T_i$ as a root label. $|T_i(C)|$ is
the number of T-trees of the T-list $T_i(C)$. To simplify list expressions in
the sequel, we use $(a_i)_1^n$ to denote the list $(a_1, a_2, \ldots, a_n)$.
Many ways exist to recursively compare trees, by using combined criteria (root
label, children count, node count, etc.). For rigor, we propose a definition
using two orders $\trianglelefteq$ and $\ll$.

Definition 7 (The relations $\trianglelefteq$, $\trianglelefteq_{lex}$, $\ll$ and $\ll_{lex}$).

We define the following four relations: $\trianglelefteq$ compares T-trees with
roots of the same type $T$, $\trianglelefteq_{lex}$ is its lexicographic
generalization to T-lists, $\ll$ compares two T-lists of same type $T_i$, and
$\ll_{lex}$ is its lexicographic generalization to lists
$(T_i(C))_1^{nct(T)}$. These four order relations are recursively defined as
follows:

- $\forall T \in T_C :\ (T, ()) \trianglelefteq (T, ())$.

- $\forall C, C' \neq (T, ()) :\ C \trianglelefteq C' \Leftrightarrow (T_i(C))_1^{nct(T)} \ll_{lex} (T_i(C'))_1^{nct(T)}$.

- $\forall C, C' \neq (T, ()),\ \forall i :\ T_i(C) \ll T_i(C') \Leftrightarrow |T_i(C)| < |T_i(C')| \ \vee\ (|T_i(C)| = |T_i(C')| \wedge T_i(C) \trianglelefteq_{lex} T_i(C'))$.

In other words, each T-tree is seen as if built from a root of type $T$ and a
list of T-lists of sub-trees. These two list levels justify having two
lexicographic orders. $\trianglelefteq$ (lines 1 and 2) lexicographically
compares the lists of T-lists of two trees having the same root type. $\ll$
lexicographically compares T-lists (taking their length into account).
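One way to experiment with this order is to turn it into a sort key. The sketch below is an illustration under assumptions, not the paper's implementation: T-trees are nested pairs `(type, children)`, and the dictionary `CT` is a hypothetical schema giving, for each type, its component types in the assumed type order. It relies on the fact that Python compares tuples lexicographically, so a pair `(length, element keys)` compares the length of a T-list first and its trees next.

```python
# Hypothetical schema: component types of each type, in the assumed type order.
CT = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

def t_list(tree, t):
    """T-list of `tree` for component type t: its children with root label t."""
    return [c for c in tree[1] if c[0] == t]

def key(tree):
    """Sort key such that key(C) <= key(C') mirrors the order on T-trees.
    There is one (length, keys of the trees) pair per component type, so two
    trees with the same root type always yield tuples of the same shape."""
    return tuple(
        (len(lst), tuple(key(c) for c in lst))
        for lst in (t_list(tree, t) for t in CT[tree[0]])
    )

a = ("A", ())                          # a lone root
b = ("A", (("B", ()),))                # one B child
c = ("A", (("C", ()),))                # one C child
d = ("A", (("B", (("D", ()),)),))      # one B child carrying a D

ordered = sorted([d, b, c, a], key=key)
print(ordered == [a, c, b, d])
```

The tree with a single C child comes before the tree with a single B child because the T-list of B children is compared first and is empty in the former.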

Proposition 3. The relations $\trianglelefteq$, $\trianglelefteq_{lex}$, $\ll$
and $\ll_{lex}$ are total orders.

Proof. As any lexicographic order defined from a total order is itself total, it
remains to prove that the relations $\trianglelefteq$ and $\ll$ are total
orders. To demonstrate that a binary relation is a total order it suffices to
show that any two elements from the set of reference can be compared, either one
being less than or equal to the other. The proof is by induction on the height
of T-trees.

— there exists only one T-tree of height 0 having a root labeled with the type
$T$: $(T, ())$. $\forall T,\ (T, ()) \trianglelefteq (T, ())$.

— assume that for any two T-trees $C$ and $C'$ of height less than $h$, either
$C \trianglelefteq C'$ or $C' \trianglelefteq C$ holds. Any couple of T-lists
$L = (c_1, \ldots, c_{|L|})$ and $L' = (c'_1, \ldots, c'_{|L'|})$ with height
$h + 1$ (containing T-trees one of which at least is of height $h$) is such
that:

  • if $L = L'$ then $L \ll L'$ (and as well $L' \ll L$)

  • else $|L| \neq |L'|$ and hence either $L \ll L'$ or $L' \ll L$

  • else $|L| = |L'|$ and then $\exists j,\ \forall i < j,\ c_i = c'_i$ and
    either $c_j \trianglelefteq c'_j$ or $c'_j \trianglelefteq c_j$. Either
    $L \trianglelefteq_{lex} L'$ or $L' \trianglelefteq_{lex} L$, hence either
    $L \ll L'$ or $L' \ll L$

In all cases, $L \ll L'$ or $L' \ll L$.

— now assume that any couple of T-lists $L$ and $L'$ whose T-trees have height
less than $h$ is such that either $L \ll L'$ or $L' \ll L$. Any couple of
T-trees $C = (T, (l_1, \ldots, l_{nct(T)}))$ and
$C' = (T, (l'_1, \ldots, l'_{nct(T)}))$ of height $h$ is such that:

  • if $C = C'$ then $C \trianglelefteq C'$ (and as well $C' \trianglelefteq C$).

  • else $\exists j,\ \forall i < j,\ l_i = l'_i$ and either $l_j \ll l'_j$ or
    $l'_j \ll l_j$. As a consequence, either $C \ll_{lex} C'$ or
    $C' \ll_{lex} C$, hence either $C \trianglelefteq C'$ or
    $C' \trianglelefteq C$.

In all cases, $C \trianglelefteq C'$ or $C' \trianglelefteq C$.

We call $P(h)$ the property "any couple of T-trees $C$ and $C'$ of height less
than $h$ is such that $C \trianglelefteq C'$ or $C' \trianglelefteq C$" and
$Q(h)$ the property "any couple of T-lists $L$ and $L'$ whose T-trees are of
height less than $h$ is such that $L \ll L'$ or $L' \ll L$". We have shown that
$P(0)$ is true, and that $\forall h$, $P(h)$ implies $Q(h)$ and $\forall h$,
$Q(h)$ implies $P(h + 1)$. We conclude that $\forall h$, $P(h)$ and $Q(h)$, and
hence that the relations $\trianglelefteq$ and $\ll$ are total orders, as are
their lexicographic extensions.

Definition 8 (Canonicity of a T-tree). A T-tree $C$ is canonical iff it has no
child, or if $\forall i$, $T_i(C)$ is sorted by $\trianglelefteq$ and
$\forall c \in T_i(C)$, $c$ itself is canonical.

Proposition 4. A T-tree is the $\trianglelefteq$-minimal representative of its
equivalence class (wrt. T-tree isomorphism) iff it is canonical.

Proof. Let $C$ and $C'$ be two isomorphic and distinct T-trees. Consider the
following prefix recursive traversal of a T-tree:

— examining a T-tree $C$ is examining its lists $T_i(C)$ in sequence.

— examining a list $T_i(C)$ is examining its length then, if the length is non
zero, examining its T-trees in sequence.

$\Leftarrow$ We first show by induction that if, according to this traversal,
two trees differ somewhere by the length of two T-lists, they are comparable
accordingly. Compare $C$ and $C'$ by performing a simultaneous prefix traversal,
and stop as soon as we meet at depth $p$ two lists $T_i(S_n)$ and $T_i(S'_n)$
with distinct lengths, $S_n$ (resp. $S'_n$) being a sub-tree in $C$ (resp.
$C'$). Call $S$ (resp. $S'$) the parent T-tree of $S_n$ (resp. $S'_n$). Suppose
that $|T_i(S_n)| < |T_i(S'_n)|$. It follows that $T_i(S_n) \ll T_i(S'_n)$. Since
$\forall j < i,\ T_j(S_n) = T_j(S'_n)$, we have
$(T_j(S_n))_1^{nct(S_n)} \ll_{lex} (T_j(S'_n))_1^{nct(S'_n)}$ and hence
$S_n \trianglelefteq S'_n$. Similarly, as $\forall j < n,\ S_j = S'_j$, it
follows that $L = (S_j)_1^{nct(T)} \trianglelefteq_{lex} (S'_j)_1^{nct(T)} = L'$
and hence $L \ll L'$. We thus proved that if two lists $T_i(S_n)$ and
$T_i(S'_n)$ of depth $p$ are such that $T_i(S_n) \ll T_i(S'_n)$ then the
sub-trees $S_n$ and $S'_n$ of depth $p$ which contain these lists are such that
$S_n \trianglelefteq S'_n$, and thus that the lists $L$ and $L'$ of depth
$p - 1$ which contain $S_n$ and $S'_n$ are such that $L \ll L'$. It follows that
$S$ and $S'$, which are of depth $p - 1$ and which contain $L$ and $L'$, are
such that $S \trianglelefteq S'$ and, by induction, that
$C \trianglelefteq C'$.

Suppose now that $C$ is canonical (and thus that $C'$ is not). Compare $C$ and
$C'$ via a prefix traversal until we encounter two distinct sub-trees $S_n$ and
$S'_n$. As the list $L'$ which contains $S'_n$ is a permutation of the list $L$
which contains $S_n$, and since $\forall j < n,\ S_j = S'_j$, then
$\exists m > n,\ S_m = S'_n$. As the list $L$ is sorted according to
$\trianglelefteq$, we have $S_n \trianglelefteq S_m$ and thus
$S_n \trianglelefteq S'_n$. It follows that $C \trianglelefteq C'$. As the
relation $C \trianglelefteq C'$ is true $\forall C' \in Iso(C)$, $C$ is
$\trianglelefteq$-minimal over $Iso(C)$.

$\Rightarrow$ Now suppose that $C$ is $\trianglelefteq$-minimal over $Iso(C)$.
We prove the contrapositive by assuming that $C$ is not canonical. Traverse $C$
as usual, and stop as soon as two sub-trees $S_n$ and $S_{n+1}$ are met such
that $S_{n+1} \trianglelefteq S_n$ with $S_{n+1} \neq S_n$. This necessarily
happens since there exists at least one non sorted list of sub-trees, because
$C$ is not canonical. Consider the tree $C'$ resulting from the permutation
$\sigma$ which simply exchanges $S_n$ and $S_{n+1}$. We have $C' \in Iso(C)$. As
$S_{n+1} \trianglelefteq S_n$, then $\sigma(S_n) \trianglelefteq S_n$, and it
follows that $C' \trianglelefteq C$ with $C' \neq C$, which contradicts the
$\trianglelefteq$-minimality of $C$. $C$ is thus canonical.

Fig. 4. The first 26 T-trees ordered by $\trianglelefteq$, for a problem where
at most two objects of type D can connect to an object of type B, and two
objects of types B and C may connect to an object of type A. The numbers of the
$\trianglelefteq$-minimal representatives are framed.
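Proposition 4 can be checked by brute force on a tiny example. The sketch below is only indicative and rests on assumptions: the nested pair encoding `(type, children)`, the hypothetical schema `CT`, and a sort key standing in for the order on T-trees. It sorts every T-list recursively, then compares the result with the minimum of the whole isomorphism class, which is enumerated exhaustively.

```python
from itertools import permutations, product

# Hypothetical schema: ct(T) for each type T, in the assumed type order.
CT = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

def key(tree):
    """Sort key mirroring the tree order: one (length, keys) pair per T-list."""
    label, kids = tree
    return tuple(
        (len(lst), tuple(key(c) for c in lst))
        for lst in ([c for c in kids if c[0] == t] for t in CT[label])
    )

def canonical_form(tree):
    """Recursively sort every T-list; children stay grouped by type."""
    label, kids = tree
    kids = [canonical_form(c) for c in kids]
    ordered = []
    for t in CT[label]:
        ordered.extend(sorted((c for c in kids if c[0] == t), key=key))
    return (label, tuple(ordered))

def iso_class(tree):
    """All T-trees isomorphic to `tree`, by brute force (tiny trees only)."""
    label, kids = tree
    rank = {t: i for i, t in enumerate(CT[label])}
    result = set()
    for choice in product(*(iso_class(c) for c in kids)):
        for perm in permutations(choice):
            # Keep only the orderings where children remain grouped by type.
            if all(rank[x[0]] <= rank[y[0]] for x, y in zip(perm, perm[1:])):
                result.add((label, perm))
    return result

tree = ("A", (("B", (("D", ()),)), ("B", ()), ("C", ())))
canon = canonical_form(tree)
print(canon == min(iso_class(tree), key=key))
```

Here the class contains two T-trees, which differ by the order of the two B children, and the sorted one is the minimum.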
4 Enumerating T-trees



The rest of the study proposes on one hand a procedure allowing for the explicit
production of only the canonical T-trees, and on the other hand an algorithm
to test and filter out non canonical T-trees. These two tools are meant to be
integrated as components within general purpose configurators, so as to avoid
exploring solutions built upon redundant solutions of the inner structural
problem of a given configuration problem. In the sequel, we continue to call
"configurations" the solutions of a structural problem. Generating a
configuration amounts to incrementally building a T-tree which satisfies all
structural constraints.

Definition 9 (Extension). We call extension of a T-tree $C$ a T-tree $C'$ which
results from adding nodes to $C$. We call unit extension an extension which
results from adding a single terminal node.

The search space of a (structural) configuration problem can be described by
a state graph $G = (V, E)$ where the nodes in $V$ correspond to valid (solution)
T-trees and the edge $(t_1, t_2) \in E$ iff $t_2$ is a unit extension of $t_1$.
The goal of a constructive search procedure is to find a path in $G$ starting
from the tree $(t, ())$ (recall that $t$ is the type of the root object in the
configuration) and reaching a T-tree which respects all the problem constraints
(i.e. not only the constraints involved in the structural problem).

Definition 10 (Canonical removal of a terminal node). To canonically remove a
terminal node from a T-tree $C$ not reduced to a single node consists in
selecting its first non empty T-list $T_i(C)$ (the first according to
$\prec_{T_C}$), then selecting a T-tree $C_j$ in this T-list: the first which is
not a leaf if one exists, or the last leaf otherwise. In the first case we
recursively canonically remove one node of $C_j$; in the other case, we simply
remove the last leaf from the list.

Notice that since the state graph is directed, the canonical removal of a leaf 
is not an applicable operation to a graph node (only unit extensions apply). 
Canonical removal is technically useful to inductive proofs in the sequel. 
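Canonical removal as described in Definition 10 can be sketched as below. The encoding is an assumption: a T-tree is a nested pair `(type, children)` whose children are stored grouped by type, and `ct` is a hypothetical schema listing the component types of each type in the assumed type order.

```python
def canonical_removal(tree, ct):
    """Remove one terminal node from `tree`, following Definition 10."""
    label, kids = tree
    if not kids:
        raise ValueError("the T-tree is reduced to a single node")
    # First non empty T-list, according to the order of component types.
    first_type = next(t for t in ct[label] if any(c[0] == t for c in kids))
    positions = [i for i, c in enumerate(kids) if c[0] == first_type]
    # First T-tree of that T-list which is not a leaf, if any.
    target = next((i for i in positions if kids[i][1]), None)
    if target is not None:
        reduced = canonical_removal(kids[target], ct)
        return (label, kids[:target] + (reduced,) + kids[target + 1:])
    # Otherwise the T-list only holds leaves: drop the last one.
    last = positions[-1]
    return (label, kids[:last] + kids[last + 1:])

ct = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
tree = ("A", (("B", ()), ("B", (("D", ()),)), ("C", ())))
steps = [tree]
while steps[-1][1]:
    steps.append(canonical_removal(steps[-1], ct))
for step in steps:
    print(step)
```

Starting from a tree with five nodes, four removals lead to the single root: the D is removed first, then the two B leaves from the end of their T-list, then the C.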

Proposition 5. The canonical removal of a terminal node in a T-tree $C$ not
reduced to a single node produces a T-tree $C'$ such that
$C' \trianglelefteq C$.

Proof. Let $C_j$ be the $j$-th T-tree of a T-list and $C'_j$ the tree resulting
from the canonical removal of a node in $C_j$. The proof is by induction over
the depth $p$ of the root of $C_j$ in $C$. Let $L$ and $L'$ be the T-lists (of
depth $p - 1$) containing $C_j$ and $C'_j$:

— if $C_j$ is a single node, it is removed from its T-list, thus $L' \ll L$.

— else, if the canonical removal of a node of the T-tree $C_j$ of depth $p$
produces a T-tree $C'_j$ such that $C'_j \trianglelefteq C_j$, then
$(C_1, \ldots, C_{j-1}, C'_j, \ldots) \trianglelefteq_{lex} (C_1, \ldots, C_{j-1}, C_j, \ldots)$
and thus $L' \ll L$.

In both cases, $L$ being the only T-list of $C$ modified to obtain $L'$ (which
transforms $C$ into $C'$), the same rationale leads to $C' \trianglelefteq C$.

Proposition 6. Let $G$ be the state graph of a configuration problem. Its
subgraph $G_c$ restricted to the canonical T-trees is connected.

Proof. It amounts to proving that any canonical T-tree can be reached by a
sequence of canonical unit extensions from the T-tree $(t, ())$, or that (taken
from the opposite side) the canonicity of a T-tree is preserved by canonical
removal. We proceed by induction over the height of T-trees.

— Let $p$ be the depth of the removed node. By definition of the canonical
removal, it occurred at the end of its T-list, which hence remains sorted after
the change, and the parent T-tree (of depth $p - 1$) remains canonical, since
nothing else is modified in the process.

— Now we show that whatever the value of $p$, if the canonical removal of a node
in a T-tree $C$ of depth $p$ preserves the canonicity of $C$, then the T-tree of
depth $p - 1$ which contains $C$ remains canonical. By proposition 5, the
canonical removal of a node in a T-tree $C$ produces a T-tree $C'$ such that
$C' \trianglelefteq C$. Canonical removal operates by selecting the first
T-tree in a T-list that contains more than one node. If $C$ is not the last
T-tree of its T-list, call $C_{right}$ the T-tree immediately after $C$ in the
T-list. As $C' \trianglelefteq C$, we still have $C' \trianglelefteq C_{right}$.
If $C$ is not the first T-tree of its T-list, we call $C_{left}$ the T-tree
immediately at the left of $C$ in the T-list. As $C$ is the leftmost T-tree
containing more than one node, $C_{left}$ contains a single node, with the same
root label as $C$ and $C'$. Since $C$ contained more than one node, $C'$
contains at least one node and $C_{left} \trianglelefteq C'$. Consequently, the
canonical removal of a node in a T-tree (of depth $p$) of a T-list (of depth
$p - 1$) leaves the T-list sorted. And the T-tree of depth $p - 1$ which
contains this T-list, which is the only modified one, thus remains canonical.

We conclude that canonical removal preserves the canonicity of all the
sub-T-trees, whatever their depth in the T-tree. By this operation, a T-tree
remains canonical. The subgraph $G_c$ is thus connected.

A corollary of great practical importance follows immediately:

Corollary 1. A configuration generation procedure that filters out the
interpretations containing a non canonical structural configuration remains
complete.

Proof. According to proposition 6, rejecting non canonical T-trees does not
prevent reaching all canonical T-trees, since each canonical T-tree can be
reached by a sequence of canonical unit extensions from the initial T-tree
$(t, ())$.

It thus suffices to add a canonicity test to any complete enumeration procedure
for T-trees to obtain a procedure which remains complete (in the set of
equivalence classes for T-tree isomorphism) while avoiding the enumeration of
isomorphic (redundant) T-trees.
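The effect of the filter can be observed on a small search. The sketch below is a simplified illustration, not the paper's procedure: it assumes a chain of types where every node admits at most `k` children of a single type and the height is bounded by `p`, so that a T-tree can be encoded as a bare nested tuple of children. The function names are hypothetical. It explores the state graph by unit extensions, once without filter and once rejecting non canonical T-trees.

```python
def key(tree):
    """Sort key for the single-type case: length first, then the children."""
    return (len(tree), tuple(key(c) for c in tree))

def is_canonical(tree):
    """All sub-trees are canonical and the children are sorted."""
    return all(is_canonical(c) for c in tree) and all(
        key(a) <= key(b) for a, b in zip(tree, tree[1:])
    )

def unit_extensions(tree, depth, p, k):
    """T-trees obtained by adding one leaf, within height p and k children."""
    if depth < p and len(tree) < k:
        for pos in range(len(tree) + 1):
            yield tree[:pos] + ((),) + tree[pos:]
    for i, child in enumerate(tree):
        for ext in unit_extensions(child, depth + 1, p, k):
            yield tree[:i] + (ext,) + tree[i + 1:]

def reachable(p, k, accept):
    """T-trees reachable from the root through accepted unit extensions."""
    seen, frontier = {()}, [()]
    while frontier:
        current = frontier.pop()
        for ext in unit_extensions(current, 0, p, k):
            if ext not in seen and accept(ext):
                seen.add(ext)
                frontier.append(ext)
    return seen

everything = reachable(2, 2, lambda t: True)
canonical_only = reachable(2, 2, is_canonical)
print(len(everything), len(canonical_only))  # 13 10
```

With `p = 2` and `k = 2` the unfiltered search visits 13 T-trees while the filtered one visits 10, one per equivalence class. Every canonical T-tree is still reached although the search never steps through a non canonical one, as stated by Corollary 1.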
5 Algorithms



A test of canonicity straightforwardly follows from the definition of canonicity.
It is defined by two functions, Canonical and Less, listed in pseudo code in
figure 5. We note $ct(T)$ the list of component types of $T$, sorted according
to $\prec_{T_C}$, and by extension, as the labels of nodes of a T-tree are
types, we generalize this notation to $ct(C)$ for a given T-tree $C$. Note that
the function Less compares T-trees with the same root type.



function Canonical(C)
{returns True iff C is canonical}
begin
    if C is a leaf then return True
    Let ct(C) = (T_1, ..., T_k)
    for i := 1 to k do
        Let (a_1, ..., a_l) be the list T_i(C)
        for j := 1 to l do
            if not(Canonical(a_j)) then
                return False
        for j := 1 to l - 1 do
            if not(Less(a_j, a_{j+1})) then
                return False
    return True
end function



function Less(C, C')
{returns True iff $C \trianglelefteq C'$}
begin
    if C is a leaf then return True
    if C' is a leaf then return False
    Let ct(C) = (T_1, ..., T_k)
    for i := 1 to k do
        Let (a_1, ..., a_{l_a}) be the list T_i(C),
        Let (b_1, ..., b_{l_b}) be the list T_i(C')
        if (l_a < l_b) then return True
        if (l_a > l_b) then return False
        for j := 1 to l_a do
            if (Less(a_j, b_j) = False) then
                return False
    return True
end function



Fig. 5. The functions Canonical and Less 
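For experimentation, the canonicity test can be written in Python along the following lines. This is a sketch under assumptions and not a transcription of figure 5: T-trees are nested pairs `(type, children)`, `CT` is a hypothetical schema giving $ct(T)$, and the Boolean Less is replaced by a three-way comparison `compare`, so that a strictly smaller element decides a lexicographic comparison immediately, following my reading of Definition 7.

```python
# Hypothetical schema: ct(T) for each type T, in the assumed type order.
CT = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}

def t_lists(tree):
    """The T-lists of `tree`, one per component type of its root."""
    return [[c for c in tree[1] if c[0] == t] for t in CT[tree[0]]]

def compare(c1, c2):
    """Three-way comparison of two T-trees with the same root type.
    Returns -1, 0 or 1 when c1 is smaller than, equal to or larger than c2."""
    for l1, l2 in zip(t_lists(c1), t_lists(c2)):
        if len(l1) != len(l2):            # shorter T-list first
            return -1 if len(l1) < len(l2) else 1
        for a, b in zip(l1, l2):          # then element by element
            result = compare(a, b)
            if result != 0:
                return result
    return 0

def less(c1, c2):
    return compare(c1, c2) <= 0

def canonical(tree):
    """True iff every T-list is sorted and every sub-tree is canonical."""
    for lst in t_lists(tree):
        if not all(canonical(c) for c in lst):
            return False
        if not all(less(a, b) for a, b in zip(lst, lst[1:])):
            return False
    return True

leaf_b = ("B", ())
b_with_d = ("B", (("D", ()),))
print(canonical(("A", (leaf_b, b_with_d, ("C", ())))))   # sorted B children
print(canonical(("A", (b_with_d, leaf_b, ("C", ())))))   # unsorted B children
```

The first tree is accepted and the second is rejected. Rebuilding the T-lists at each call keeps the sketch short at the price of extra work; an implementation concerned with the stated complexity would store T-lists directly.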



5.1 Complexity 



The worst case complexity of the function Less is linear in $n$ ($O(n)$), $n$
being the number of nodes of the smaller of the two T-trees: it is called at
most once on each node. The function Canonical is of complexity $O(n \log n)$ in
the worst case. It recursively calls itself for each sub-tree of its argument
and tests that their T-lists are sorted via calls to Less.



5.2 Applications 

The algorithm described in figure 5 can be used as a constraint to filter
out the non canonical solutions of the structural sub-problem of a configuration
problem, whichever enumeration procedure and data structures are used (for
example within the object oriented approach described in [5]). It can be
integrated so that the test of canonicity is amortized over the search, if the
T-tree corresponding to the currently built configuration grows by unit
extensions. In that case, the top part of the search made by "Canonical", which
operates on a part of the T-tree that did not change, may be saved.
6 Counting T-trees



In this section, we show the potentially very important benefit that results from
the enumeration of only the canonical T-trees, compared with a standard
exhaustive enumeration of all possible T-trees. To this end, we count the total
number of T-trees and of canonical T-trees in a particular case of T-trees, those
for which each type (the label of nodes) may have children of a single type. The
corresponding configuration problem can be defined as follows: $p + 1$ object
types $T_0, T_1, \ldots, T_p$ that can be interconnected by the composition
relations $R(T_0, T_1)$, $R(T_1, T_2)$, ... and $R(T_{p-1}, T_p)$. $T_0$ is the
root type and there exists exactly one object with this type. We may connect
from 0 to $k$ objects of type $T_{i+1}$ to any object of type $T_i$. These
T-trees are called $k$-connected. We note $N_{p,k}$ (resp. $M_{p,k}$) the total
number of $k$-connected T-trees (resp. canonical $k$-connected T-trees) of
maximal height $p$.



6.1 Number of $k$-connected T-trees of depth $p$, $N_{p,k}$

A T-tree of maximal height $p$ can be built by connecting from 0 to $k$ T-trees
of maximal height $p - 1$ to a root node. The number of arrangements of $i$
elements (some of which may be identical) among $N_{p-1,k}$ is
$(N_{p-1,k})^i$. $N_{p,k}$ is thus recursively defined by: $N_{0,k} = 1$ (the
tree containing a single root object, thus no object of type $T_1$),
$N_{1,k} = k + 1$ (the configurations of 0 to $k$ objects of type $T_1$ without
more children) and

$$\forall p > 1,\quad N_{p,k} = \sum_{i=0}^{k} (N_{p-1,k})^i = \frac{(N_{p-1,k})^{k+1} - 1}{N_{p-1,k} - 1}.$$

Then $N_{2,k}$ is in $O(k^k)$ and $N_{p,k}$ is in $O(k^{k^{p-1}})$.



6.2 Number of canonical $k$-connected T-trees of depth $p$, $M_{p,k}$

A canonical T-tree of maximal height $p$ can be obtained by connecting, in the
order given by $\trianglelefteq$, from 0 to $k$ canonical T-trees of maximal
height $p - 1$ to a root object. The number of combinations of $i$ elements
(some of which may be identical) among $M_{p-1,k}$ is
$\binom{M_{p-1,k} + i - 1}{i}$. $M_{p,k}$ is thus recursively defined by:
$M_{0,k} = 1$ (the T-tree reduced to a single node) and

$$M_{p,k} = \sum_{i=0}^{k} \binom{M_{p-1,k} + i - 1}{i} = \binom{M_{p-1,k} + k}{k} = \frac{(M_{p-1,k} + k)!}{M_{p-1,k}!\; k!}.$$

By the Stirling formula ($n! = \sqrt{2\pi}\, n^{n + \frac{1}{2}} e^{-n + \epsilon(n)}$), we get

$$M_{p,k} \approx \frac{1}{\sqrt{2\pi}} \cdot \frac{(M_{p-1,k} + k)^{M_{p-1,k} + k + \frac{1}{2}}}{(M_{p-1,k})^{M_{p-1,k} + \frac{1}{2}}\; k^{k + \frac{1}{2}}}.$$

$M_{1,k} = k + 1$, $M_{2,k}$ is in $O(4^k)$ and $M_{p,k}$ is in $O(\ldots)$, a
bound of which only the exponent $k^{p-1}$ is legible here. We see that
$M_{p,k}$ is much smaller than $N_{p,k}$ for big values of $p$ and $k$. Table 1
exhibits important benefits, even with very small values of $p$ and $k$. The
case $p = 2$, $k = 2$ corresponds to the first 13 T-trees in figure 4. In the
general case, where more than one composition relation exists for each type,
the impact of removing redundancies is even more important.



N_{p,k} / M_{p,k}   k = 1     k = 2       k = 3             k = 4
p = 1               2 / 2     3 / 3       4 / 4             5 / 5
p = 2               3 / 3     13 / 10     85 / 35           781 / 126
p = 3               4 / 4     183 / 66    621436 / 8436     3.73 10^11 / 1.13 10^7

Table 1. Comparison of $N_{p,k}$ and $M_{p,k}$ for small values of $p$ and $k$.
For ($p = 3$, $k = 4$), the largest T-tree has 4 objects of type $T_1$, 16
objects of type $T_2$ and 64 objects of type $T_3$.
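The two recurrences of this section are easy to evaluate numerically. The sketch below is a direct reading of them, with hypothetical function names `n_count` and `m_count`; it starts from the value 1 for $p = 0$ and applies the recurrence $p$ times.

```python
from math import comb

def n_count(p, k):
    """Number of k-connected T-trees of maximal height p."""
    n = 1
    for _ in range(p):
        n = sum(n ** i for i in range(k + 1))
    return n

def m_count(p, k):
    """Number of canonical k-connected T-trees of maximal height p."""
    m = 1
    for _ in range(p):
        m = comb(m + k, k)
    return m

for p in (1, 2, 3):
    print(p, [(n_count(p, k), m_count(p, k)) for k in (1, 2, 3, 4)])
```

For instance `n_count(2, 2)` is 13 and `m_count(2, 2)` is 10, while `n_count(3, 2)` is 183 and `m_count(3, 2)` is 66. The gap widens quickly as $p$ and $k$ grow.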

7 Conclusion 

Configuration problems are a difficult application of constraint programming,
since they exhibit many isomorphisms. We have shown that part of these
isomorphisms, those stemming from the properties of a sub-problem called the
structural problem, can be efficiently and totally tackled, by using low cost
amortizable algorithms, so as to explore only the configurations built upon a
canonical solution of the structural sub-problem. We have also theoretically
computed the numbers of canonical and non canonical solutions of a simplified
problem, showing that in this case already, there are far fewer canonical than
non canonical configurations.

These results extend the possibilities of dealing with isomorphisms in
configurations, until today limited either to the detection of the
interchangeability of all yet unused individuals of each type or to the use of
counters of non configurable objects (as in the ILOG software products [8]).
Both approaches share the limitation of not dealing with the structural bases
of interchangeability (for example, in case 14 of figure 4, the two "B" are
interchangeable, since they form the roots of two equal trees placed in the
same context, under the same "A"; the "D" which appear underneath are also
interchangeable).

Our proposal makes it possible to target, in the near future, the complete
elimination of configuration isomorphisms, without needing changes in the
models (such as using counters by types rather than references to objects in
relations).